Training a Swedish POS-tagger for Stanford CoreNLP – Andreas Klintberg – Medium
This will be a very short tutorial on how to train a CoreNLP POS model for Swedish, since the CoreNLP package does not ship one and I haven't found an open-source one out there just yet.

From Wikipedia: "part-of-speech tagging (POS tagging or POST), also called grammatical tagging or word-category disambiguation, is the process of marking up a word in a text (corpus) as corresponding to a particular part of speech, based on both its definition, as well as its context". It is also sometimes called shallow parsing, since it does not build a deeper structure over the different parts of the sentence. A simple form of POS tagging is what you are taught in the first years of school: identifying words as nouns, verbs, adjectives and adverbs.

First we need some training data for our Swedish POS-tagger. I've used the Swedish Treebank at http://stp.lingfil.uu.se/~nivre/swedish_treebank/, specifically the Talbanken part; they also provide a conversion to Stanford dependencies.

After we've downloaded it, we get two files,

After you've downloaded the POS-tagger part (use the -full version to get all the models, German, French, etc.), it's time to create your .props